This computationally generated report gives an overview of US counties. It uses recent COVID-19 case data curated by The New York Times and available in their GitHub repository. We will calculate the percentage of people recently infected in each county and use that to estimate the chance of running into an infected person in a few different scenarios. This project was directly inspired by the work of Joshua Weitz at Georgia Tech.

US counties have reported 358,165 cases in the last 2 weeks


Let’s look at total US cases over time. The country total is shown in red and individual counties are shown in black.

We can see there has been a massive increase in cases in the past two months, and that the US cases have increased due to increases in many counties. The counties show wide variability, and looking closely we can see certain counties that experienced much later or earlier outbreaks.

What is slightly harder to see on the log scale used above is that US cases are still increasing dramatically. Let’s look at how many cases have been reported for each county in the last two weeks. We’ll spread the counties out by their population size, since bigger counties tend to have more infections.

We can see that a lot of counties are reporting a lot of cases. How can we think about the risk of encountering infected people in each of these counties? Let’s start by calculating the percent of each county’s population that has been reported infected in the last two weeks. We do this by finding the number of new cases reported in the last two weeks and dividing by the estimated 2019 population of the county.
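The prevalence calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind this report: the function name, the 14-day window parameter, and the example county are hypothetical, and it assumes the case counts are running (cumulative) totals with one value per day.

```python
def two_week_prevalence_percent(cumulative_cases, population, window_days=14):
    """Return the percent of a population reported infected in the last window.

    cumulative_cases: running case totals, one per day, oldest first.
    population: estimated county population (the report uses 2019 estimates).
    """
    if population <= 0:
        raise ValueError("population must be positive")
    if len(cumulative_cases) <= window_days:
        raise ValueError("need more than window_days of history")
    # New cases in the window = latest total minus the total window_days earlier.
    new_cases = cumulative_cases[-1] - cumulative_cases[-1 - window_days]
    # Assumption: treat downward revisions of the totals as zero new cases.
    new_cases = max(new_cases, 0)
    return 100.0 * new_cases / population


# Hypothetical county: 3 new cases per day, 100,000 residents.
cumulative = [1000 + 3 * day for day in range(15)]
print(two_week_prevalence_percent(cumulative, 100_000))
```

In this made-up example the county gains 42 cases over 14 days, which is 0.042% of its 100,000 residents.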

The median US county reported 0.0255% prevalence in the last two weeks


There is wide variation across the country, with certain counties reporting 0%, while other counties are reporting up to 11.8%.

True cases are higher than reported cases


Practically, reported cases are a lower bound for the true cases and we can’t really know the true number of cases in a county. Instead we can try to estimate the true number of cases crudely by looking at the number of deaths in each county. Sadly, US counties reported a total of 22,265 deaths in the last two weeks. If we assume that cases and deaths are constant and that the death rate of COVID-19 is 1%, then we can estimate true cases as \(\text{deaths} \times 100\), and the factor by which cases are undercounted as \((\text{deaths} \times 100) / \text{reported cases}\). That would mean the true number of US cases is 6.2x higher than reported.
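As a rough sketch of this arithmetic, the Python below applies the assumed 1% death rate to the national two-week totals quoted in this report. The function name and its handling of counties with zero reported cases are illustrative assumptions, not part of the original analysis.

```python
def estimate_undercount(reported_cases, reported_deaths, fatality_rate=0.01):
    """Return (estimated_true_cases, undercount_factor) from death counts.

    fatality_rate: assumed fraction of true cases that end in death (1% here).
    undercount_factor is None when no cases were reported (ratio undefined).
    """
    if not 0.0 < fatality_rate <= 1.0:
        raise ValueError("fatality_rate must be in (0, 1]")
    # With a 1% death rate, each death implies about 100 true cases.
    estimated_true_cases = reported_deaths / fatality_rate
    if reported_cases <= 0:
        return estimated_true_cases, None
    return estimated_true_cases, estimated_true_cases / reported_cases


# National two-week totals quoted in this report.
true_cases, factor = estimate_undercount(358_165, 22_265)
print(f"estimated true cases: {true_cases:,.0f}, undercount: {factor:.1f}x")
```

The 22,265 deaths imply roughly 2.2 million true cases, about 6.2 times the 358,165 reported.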

We can perform the same calculation for every US county to try correcting for the undercounting of true cases.

Currently, most US counties are probably undercounting cases by a lot.

By this imperfect estimate, cases in the median US county are being undercounted by 7.0x. This is not a precise estimate, but hopefully it gives you a sense of the uncertainty in the reported values.

What is the chance of running into an infected person?


As states reopen, we may wonder, “What is the chance someone around me is infected?” Although the chance that any individual person is infected may remain small, as we interact with more and more people the chances go up dramatically. Below is a plot to explore the chance that a subset of people within this county has at least one person infected. Simply, we can estimate this by calculating the probability that no one is infected. The remaining probability is the chance someone is infected. See here for more details.
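In symbols (the names \(p\) and \(n\) are introduced here for illustration and do not appear in the original report), suppose a fraction \(p\) of the population is infected and each of the \(n\) people in a group is infected independently with that probability. The chance that no one in the group is infected is \((1 - p)^{n}\), so the chance that at least one person is infected is:

\[
P(\text{at least one infected}) = 1 - (1 - p)^{n}
\]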

This calculation assumes that everyone is mixed together evenly and makes some other back-of-the-envelope assumptions, but it can help us think about the situations we experience every day.

The answers depend on how many people we are talking about (and the percentage of people infected). As you explore the plot below, think about these questions and how many people you encounter on a normal basis.

The plot below shows the chance that at least one person is infected in a subset of people, depending on the number of people in the subset and the fraction of the population that is infected.

For example, if we expect to see 100 people at the grocery store in a county with 1% infected, what is the chance someone in the store is infected?

The median county has a reported prevalence of 0.0255%, so for a grocery store in that county:
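Both grocery store scenarios can be worked through with a minimal Python sketch of the calculation described earlier: find the probability that no one is infected and take the remainder. The function name is hypothetical, and reusing a group of 100 people for the median county is an assumption carried over from the 1% example.

```python
def chance_at_least_one_infected(group_size, prevalence):
    """Chance that a group contains at least one infected person.

    group_size: number of people encountered.
    prevalence: fraction of the population infected (0.01 means 1%).
    Assumes everyone is evenly mixed, so each person is infected
    independently with probability `prevalence`.
    """
    if group_size < 0:
        raise ValueError("group_size must be non-negative")
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be a fraction between 0 and 1")
    # Probability that every one of the group_size people is uninfected.
    chance_nobody_infected = (1.0 - prevalence) ** group_size
    return 1.0 - chance_nobody_infected


# 100 shoppers in a county with 1% infected.
one_percent_county = chance_at_least_one_infected(100, 0.01)
# 100 shoppers at the median reported prevalence of 0.0255%.
median_county = chance_at_least_one_infected(100, 0.000255)
print(f"1% county: {one_percent_county:.1%}, median county: {median_county:.1%}")
```

With these inputs the 1% example comes out to roughly 63%, and the median-county example to roughly 2.5%. Both numbers rest on reported prevalence, so the undercounting discussed earlier would push them higher.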

We can’t know what the true chance is, but hopefully this back-of-the-envelope calculation helps you to think about some common scenarios in a reasonable way.

Below is a list of all counties in the US with the data used to construct the above plots. Each entry has a link to a customized report for that county.



So, what is the chance of encountering an infected person?

The chance is rarely zero.


What conclusions can we draw from this data? For most counties, the percentage of people reported infected is low. However, as we run into more and more people, the chance of encountering an infected person is rarely zero. Often such an encounter is actually quite likely. Therefore, the safest assumption is that we are interacting with infected people whenever we see more than a few people.

What can we do? People will make their own decisions, but there are some clear actions that we can take to minimize our risk.

  1. Continue to physically distance. The fewer people we interact with the lower the risk. Especially avoid large groups of people in tight spaces.
  2. When we must interact with people, we should assume someone is infected. That means taking reasonable precautions like wearing facemasks, washing hands, and keeping as much distance as possible.

Things may seem normal, but continue to stay vigilant. Good luck.


Project by Scott H. Saunders
Code available here: GitHub repository